
 regret and constraint violation


Online Convex Optimization with Stochastic Constraints

Neural Information Processing Systems

This paper considers online convex optimization (OCO) with stochastic constraints, which generalizes Zinkevich's OCO over a known simple fixed set by introducing multiple stochastic functional constraints that are i.i.d.





Provably Efficient Model-Free Constrained RL with Linear Function Approximation

Neural Information Processing Systems

We study the constrained reinforcement learning problem, in which an agent aims to maximize the expected cumulative reward subject to a constraint on the expected total value of a utility function. In contrast to existing model-based approaches or model-free methods accompanied with a 'simulator', we aim to develop the first model-free, simulator-free algorithm that achieves a sublinear regret and a sublinear constraint violation even in large-scale systems.





Simple and Fast Algorithm for Binary Integer and Online Linear Programming

Neural Information Processing Systems

Our algorithm employs one column for subgradient descent in each iteration, whereas the dual projected subgradient algorithm requires the whole constraint matrix and conducts matrix multiplication in each iteration. In addition, a class of backpressure/max-weight algorithms [25] has been developed in the control/queueing literature, and the backpressure algorithm can be interpreted from the viewpoint of a pressure gradient.
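
The one-column idea can be illustrated with a minimal sketch. The excerpt above does not state the update rule, so everything here is an assumption: the function name `one_column_dual_subgradient`, the default step size 1/sqrt(n), the per-round budget b/n, and the accept-if-reward-exceeds-price decision rule are a generic projected dual subgradient step, not necessarily the authors' exact method. The point it shows is that each iteration reads a single column of the constraint matrix and performs no full matrix multiplication.

```python
from typing import List, Optional, Sequence, Tuple


def one_column_dual_subgradient(
    rewards: Sequence[float],
    columns: Sequence[Sequence[float]],
    budgets: Sequence[float],
    step: Optional[float] = None,
) -> Tuple[List[int], List[float]]:
    """Hypothetical sketch: one pass over the columns of a binary LP.

    rewards[t] is the objective coefficient of variable t, columns[t] is
    column t of the constraint matrix, budgets is the right-hand side.
    """
    n = len(rewards)
    # Assumption: spread each budget evenly over the n rounds.
    per_round = [b / n for b in budgets]
    # Assumption: step size 1/sqrt(n) unless the caller supplies one.
    gamma = step if step is not None else 1.0 / n ** 0.5

    prices = [0.0] * len(budgets)  # dual variables, kept nonnegative
    decisions: List[int] = []

    for r_t, a_t in zip(rewards, columns):
        # Only column t is touched: cost of the item at current prices.
        cost = sum(a * p for a, p in zip(a_t, prices))
        x_t = 1 if r_t > cost else 0
        decisions.append(x_t)
        # Subgradient step on the dual, then projection onto p >= 0.
        prices = [
            max(p + gamma * (a * x_t - d), 0.0)
            for p, a, d in zip(prices, a_t, per_round)
        ]

    return decisions, prices
```

Each iteration costs time proportional to the number of constraints, which is the contrast with methods that multiply by the whole constraint matrix per step. The returned decisions are not guaranteed to be feasible in this sketch; it only demonstrates the per-column update.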

